Search Results: "ras"

25 February 2024

Russ Allbery: Review: The Fund

Review: The Fund, by Rob Copeland
Publisher: St. Martin's Press
Copyright: 2023
ISBN: 1-250-27694-2
Format: Kindle
Pages: 310
I first became aware of Ray Dalio when either he or his publisher plastered advertisements for The Principles all over the San Francisco 4th and King Caltrain station. If I recall correctly, there were also constant radio commercials; it was a whole thing in 2017. My brain is very good at tuning out advertisements, so my only thought at the time was "some business guy wrote a self-help book." I think I vaguely assumed he was a CEO of some traditional business, since that's usually who writes heavily marketed books like this. I did not connect him with hedge funds or Bridgewater, which I have a bad habit of confusing with Blackwater. The Principles turns out to be more of a laundered cult manual than a self-help book. And therein lies a story. Rob Copeland is currently with The New York Times, but for many years he was the hedge fund reporter for The Wall Street Journal. He covered, among other things, Bridgewater Associates, the enormous hedge fund founded by Ray Dalio. The Fund is a biography of Ray Dalio and a history of Bridgewater from its founding as a vehicle for Dalio's advising business until 2022 when Dalio, after multiple false starts and title shuffles, finally retired from running the company. (Maybe. Based on the history recounted here, it wouldn't surprise me if he was back at the helm by the time you read this.) It is one of the wildest, creepiest, and most abusive business histories that I have ever read. It's probably worth mentioning, as Copeland does explicitly, that Ray Dalio and Bridgewater hate this book and claim it's a pack of lies. Copeland includes some of their denials (and many non-denials that sound as good as confirmations to me) in footnotes that I found increasingly amusing.
A lawyer for Dalio said he "treated all employees equally, giving people at all levels the same respect and extending them the same perks."
Uh-huh. Anyway, I personally know nothing about Bridgewater other than what I learned here and the occasional mention in Matt Levine's newsletter (which is where I got the recommendation for this book). I have no independent information whether anything Copeland describes here is true, but Copeland provides the typical extensive list of notes and sourcing one expects in a book like this, and Levine's comments indicated it's generally consistent with Bridgewater's industry reputation. I think this book is true, but since the clear implication is that the world's largest hedge fund was primarily a deranged cult whose employees mostly spied on and rated each other rather than doing any real investment work, I also have questions, not all of which Copeland answers to my satisfaction. But more on that later. The center of this book are the Principles. These were an ever-changing list of rules and maxims for how people should conduct themselves within Bridgewater. Per Copeland, although Dalio later published a book by that name, the version of the Principles that made it into the book was sanitized and significantly edited down from the version used inside the company. Dalio was constantly adding new ones and sometimes changing them, but the common theme was radical, confrontational "honesty": never being silent about problems, confronting people directly about anything that they did wrong, and telling people all of their faults so that they could "know themselves better." If this sounds like textbook abusive behavior, you have the right idea. This part Dalio admits to openly, describing Bridgewater as a firm that isn't for everyone but that achieves great results because of this culture. But the uncomfortably confrontational vibes are only the tip of the iceberg of dysfunction. Here are just a few of the ways this played out according to Copeland: In one of the common and all-too-disturbing connections between Wall Street finance and the United States' dysfunctional government, James Comey (yes, that James Comey) ran internal security for Bridgewater for three years, meaning that he was the one who pulled evidence from surveillance cameras for Dalio to use to confront employees during his trials. In case the cult vibes weren't strong enough already, Bridgewater developed its own idiosyncratic language worthy of Scientology. The trials were called "probings," firing someone was called "sorting" them, and rating them was called "dotting," among many other Bridgewater-specific terms. Needless to say, no one ever probed Dalio himself. You will also be completely unsurprised to learn that Copeland documents instances of sexual harassment and discrimination at Bridgewater, including some by Dalio himself, although that seems to be a relatively small part of the overall dysfunction. Dalio was happy to publicly humiliate anyone regardless of gender. If you're like me, at this point you're probably wondering how Bridgewater continued operating for so long in this environment. (Per Copeland, since Dalio's retirement in 2022, Bridgewater has drastically reduced the cult-like behaviors, deleted its archive of probings, and de-emphasized the Principles.) It was not actually a religious cult; it was a hedge fund that has to provide investment services to huge, sophisticated clients, and by all accounts it's a very successful one. Why did this bizarre nightmare of a workplace not interfere with Bridgewater's business? This, I think, is the weakest part of this book. Copeland makes a few gestures at answering this question, but none of them are very satisfying. First, it's clear from Copeland's account that almost none of the employees of Bridgewater had any control over Bridgewater's investments. Nearly everyone was working on other parts of the business (sales, investor relations) or on cult-related obsessions. Investment decisions (largely incorporated into algorithms) were made by a tiny core of people and often by Dalio himself. Bridgewater also appears to not trade frequently, unlike some other hedge funds, meaning that they probably stay clear of the more labor-intensive high-frequency parts of the business. Second, Bridgewater took off as a hedge fund just before the hedge fund boom in the 1990s. It transformed from Dalio's personal consulting business and investment newsletter to a hedge fund in 1990 (with an earlier investment from the World Bank in 1987), and the 1990s were a very good decade for hedge funds. Bridgewater, in part due to Dalio's connections and effective marketing via his newsletter, became one of the largest hedge funds in the world, which gave it a sort of institutional momentum. No one was questioned for putting money into Bridgewater even in years when it did poorly compared to its rivals. Third, Dalio used the tried and true method of getting free publicity from the financial press: constantly predict an upcoming downturn, and aggressively take credit whenever you were right. From nearly the start of his career, Dalio predicted economic downturns year after year. Bridgewater did very well in the 2000 to 2003 downturn, and again during the 2008 financial crisis. Dalio aggressively takes credit for predicting both of those downturns and positioning Bridgewater correctly going into them. This is correct; what he avoids mentioning is that he also predicted downturns in every other year, the majority of which never happened. These points together create a bit of an answer, but they don't feel like the whole picture and Copeland doesn't connect the pieces. It seems possible that Dalio may simply be good at investing; he reads obsessively and clearly enjoys thinking about markets, and being an abusive cult leader doesn't take up all of his time. It's also true that to some extent hedge funds are semi-free money machines, in that once you have a sufficient quantity of money and political connections you gain access to investment opportunities and mechanisms that are very likely to make money and that the typical investor simply cannot access. Dalio is clearly good at making personal connections, and invested a lot of effort into forming close ties with tricky clients such as pools of Chinese money. Perhaps the most compelling explanation isn't mentioned directly in this book but instead comes from Matt Levine. Bridgewater touts its algorithmic trading over humans making individual trades, and there is some reason to believe that consistently applying an algorithm without regard to human emotion is a solid trading strategy in at least some investment areas. Levine has asked in his newsletter, tongue firmly in cheek, whether the bizarre cult-like behavior and constant infighting is a strategy to distract all the humans and keep them from messing with the algorithm and thus making bad decisions. Copeland leaves this question unsettled. Instead, one comes away from this book with a clear vision of the most dysfunctional workplace I have ever heard of, and an endless litany of bizarre events each more astonishing than the last. If you like watching train wrecks, this is the book for you. The only drawback is that, unlike other entries in this genre such as Bad Blood or Billion Dollar Loser, Bridgewater is a wildly successful company, so you don't get the schadenfreude of seeing a house of cards collapse. You do, however, get a helpful mental model to apply to the next person who tries to talk to you about "radical honesty" and "idea meritocracy." The flaw in this book is that the existence of an organization like Bridgewater is pointing to systematic flaws in how our society works, which Copeland is largely uninterested in interrogating. "How could this have happened?" is a rather large question to leave unanswered. The sheer outrageousness of Dalio's behavior also gets a bit tiring by the end of the book, when you've seen the patterns and are hearing about the fourth variation. But this is still an astonishing book, and a worthy entry in the genre of capitalism disasters. Rating: 7 out of 10

Jacob Adams: AAC and Debian

Currently, in a default installation of Debian with the GNOME desktop, Bluetooth headphones that require the AAC codec1 cannot be used. As the Debian wiki outlines, using the AAC codec over Bluetooth, while technically supported by PipeWire, is explicitly disabled in Debian at this time. This is because the fdk-aac library needed to enable this support is currently in the non-free component of the repository, meaning that PipeWire, which is in the main component, cannot depend on it.

How to Fix it Yourself If what you, like me, need is simply for Bluetooth Audio to work with AAC in Debian s default desktop environment2, then you ll need to rebuild the pipewire package to include the AAC codec. While the current version in Debian main has been built with AAC deliberately disabled, it is trivial to enable if you can install a version of the fdk-aac library. I preface this with the usual caveats when it comes to patent and licensing controversies. I am not a lawyer, building this package and/or using it could get you into legal trouble. These instructions have only been tested on an up-to-date copy of Debian 12.
  1. Install pipewire s build dependencies
    sudo apt install build-essential devscripts
    sudo apt build-dep pipewire
    
  2. Install libfdk-aac-dev
    sudo apt install libfdk-aac-dev
    
    If the above doesn t work you ll likely need to enable non-free and try again
    sudo sed -i 's/main/main non-free/g' /etc/apt/sources.list
    sudo apt update
    
    Alternatively, if you wish to ensure you are maximally license-compliant and patent un-infringing3, you can instead build fdk-aac-free which includes only those components of AAC that are known to be patent-free3. This is what should eventually end up in Debian to resolve this problem (see below).
    sudo apt install git-buildpackage
    mkdir fdk-aac-source
    cd fdk-aac-source
    git clone https://salsa.debian.org/multimedia-team/fdk-aac
    cd fdk-aac
    gbp buildpackage
    sudo dpkg -i ../libfdk-aac2_*deb ../libfdk-aac-dev_*deb
    
  3. Get the pipewire source code
    mkdir pipewire-source
    cd pipewire-source
    apt source pipewire
    
    This will create a bunch of files within the pipewire-source directory, but you ll only need the pipewire-<version> folder, this contains all the files you ll need to build the package, with all the debian-specific patches already applied. Note that you don t want to run the apt source command as root, as it will then create files that your regular user cannot edit.
  4. Fix the dependencies and build options To fix up the build scripts to use the fdk-aac library, you need to save the following as pipewire-source/aac.patch
    --- debian/control.orig
    +++ debian/control
    @@ -40,8 +40,8 @@
                 modemmanager-dev,
                 pkg-config,
                 python3-docutils,
    -               systemd [linux-any]
    -Build-Conflicts: libfdk-aac-dev
    +               systemd [linux-any],
    +               libfdk-aac-dev
     Standards-Version: 4.6.2
     Vcs-Browser: https://salsa.debian.org/utopia-team/pipewire
     Vcs-Git: https://salsa.debian.org/utopia-team/pipewire.git
    --- debian/rules.orig
    +++ debian/rules
    @@ -37,7 +37,7 @@
     		-Dauto_features=enabled \
     		-Davahi=enabled \
     		-Dbluez5-backend-native-mm=enabled \
    -		-Dbluez5-codec-aac=disabled \
    +		-Dbluez5-codec-aac=enabled \
     		-Dbluez5-codec-aptx=enabled \
     		-Dbluez5-codec-lc3=enabled \
     		-Dbluez5-codec-lc3plus=disabled \
    
    Then you ll need to run patch from within the pipewire-<version> folder created by apt source:
    patch -p0 < ../aac.patch
    
  5. Build pipewire
    cd pipewire-*
    debuild
    
    Note that you will likely see an error from debsign at the end of this process, this is harmless, you simply don t have a GPG key set up to sign your newly-built package4. Packages don t need to be signed to be installed, and debsign uses a somewhat non-standard signing process that dpkg does not check anyway.
  1. Install libspa-0.2-bluetooth
    sudo dpkg -i libspa-0.2-bluetooth_*.deb
    
  2. Restart PipeWire and/or Reboot
    sudo reboot
    
    Theoretically there s a set of services to restart here that would get pipewire to pick up the new library, probably just pipewire itself. But it s just as easy to restart and ensure everything is using the correct library.

Why This is a slightly unusual situation, as the fdk-aac library is licensed under what even the GNU project acknowledges is a free software license. However, this license explicitly informs the user that they need to acquire a patent license to use this software5:
3. NO PATENT LICENSE NO EXPRESS OR IMPLIED LICENSES TO ANY PATENT CLAIMS, including without limitation the patents of Fraunhofer, ARE GRANTED BY THIS SOFTWARE LICENSE. Fraunhofer provides no warranty of patent non-infringement with respect to this software. You may use this FDK AAC Codec software or modifications thereto only for purposes that are authorized by appropriate patent licenses.
To quote the GNU project:
Because of this, and because the license author is a known patent aggressor, we encourage you to be careful about using or redistributing software under this license: you should first consider whether the licensor might aim to lure you into patent infringement.
AAC is covered by a number of patents, which expire at some point in the 2030s6. As such the current version of the library is potentially legally dubious to ship with any other software, as it could be considered patent-infringing3.

Fedora s solution Since 2017, Fedora has included a modified version of the library as fdk-aac-free, see the announcement and the bugzilla bug requesting review. This version of the library includes only the AAC LC profile, which is believed to be entirely patent-free3. Based on this, there is an open bug report in Debian requesting that the fdk-aac package be moved to the main component and that the pipwire package be updated to build against it.

The Debian NEW queue To resolve these bugs, a version of fdk-aac-free has been uploaded to Debian by Jeremy Bicha. However, to make it into Debian proper, it must first pass through the ftpmaster s NEW queue. The current version of fdk-aac-free has been in the NEW queue since July 2023. Based on conversations in some of the bugs above, it s been there since at least 20227. I hope this helps anyone stuck with AAC to get their hardware working for them while we wait for the package to eventually make it through the NEW queue. Discuss on Hacker News
  1. Such as, for example, any Apple AirPods, which only support AAC AFAICT.
  2. Which, as of Debian 12 is GNOME 3 under Wayland with PipeWire.
  3. I m not a lawyer, I don t know what kinds of infringement might or might not be possible here, do your own research, etc. 2 3 4
  4. And if you DO have a key setup with debsign you almost certainly don t need these instructions.
  5. This was originally phrased as explicitly does not grant any patent rights. It was pointed out on Hacker News that this is not exactly what it says, as it also includes a specific note that you ll need to acquire your own patent license. I ve now quoted the relevant section of the license for clarity.
  6. Wikipedia claims the base patents expire in 2031, with the extensions expiring in 2038, but its source for these claims is some guy s spreadsheet in a forum. The same discussion also brings up Wikipedia s claim and casts some doubt on it, so I m not entirely sure what s correct here, but I didn t feel like doing a patent deep-dive today. If someone can provide a clear answer that would be much appreciated.
  7. According to Jeremy B cha: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1021370#17

19 February 2024

Matthew Garrett: Debugging an odd inability to stream video

We have a cabin out in the forest, and when I say "out in the forest" I mean "in a national forest subject to regulation by the US Forest Service" which means there's an extremely thick book describing the things we're allowed to do and (somewhat longer) not allowed to do. It's also down in the bottom of a valley surrounded by tall trees (the whole "forest" bit). There used to be AT&T copper but all that infrastructure burned down in a big fire back in 2021 and AT&T no longer supply new copper links, and Starlink isn't viable because of the whole "bottom of a valley surrounded by tall trees" thing along with regulations that prohibit us from putting up a big pole with a dish on top. Thankfully there's LTE towers nearby, so I'm simply using cellular data. Unfortunately my provider rate limits connections to video streaming services in order to push them down to roughly SD resolution. The easy workaround is just to VPN back to somewhere else, which in my case is just a Wireguard link back to San Francisco.

This worked perfectly for most things, but some streaming services simply wouldn't work at all. Attempting to load the video would just spin forever. Running tcpdump at the local end of the VPN endpoint showed a connection being established, some packets being exchanged, and then nothing. The remote service appeared to just stop sending packets. Tcpdumping the remote end of the VPN showed the same thing. It wasn't until I looked at the traffic on the VPN endpoint's external interface that things began to become clear.

This probably needs some background. Most network infrastructure has a maximum allowable packet size, which is referred to as the Maximum Transmission Unit or MTU. For ethernet this defaults to 1500 bytes, and these days most links are able to handle packets of at least this size, so it's pretty typical to just assume that you'll be able to send a 1500 byte packet. But what's important to remember is that that doesn't mean you have 1500 bytes of packet payload - that 1500 bytes includes whatever protocol level headers are on there. For TCP/IP you're typically looking at spending around 40 bytes on the headers, leaving somewhere around 1460 bytes of usable payload. And if you're using a VPN, things get annoying. In this case the original packet becomes the payload of a new packet, which means it needs another set of TCP (or UDP) and IP headers, and probably also some VPN header. This still all needs to fit inside the MTU of the link the VPN packet is being sent over, so if the MTU of that is 1500, the effective MTU of the VPN interface has to be lower. For Wireguard, this works out to an effective MTU of 1420 bytes. That means simply sending a 1500 byte packet over a Wireguard (or any other VPN) link won't work - adding the additional headers gives you a total packet size of over 1500 bytes, and that won't fit into the underlying link's MTU of 1500.

And yet, things work. But how? Faced with a packet that's too big to fit into a link, there are two choices - break the packet up into multiple smaller packets ("fragmentation") or tell whoever's sending the packet to send smaller packets. Fragmentation seems like the obvious answer, so I'd encourage you to read Valerie Aurora's article on how fragmentation is more complicated than you think. tl;dr - if you can avoid fragmentation then you're going to have a better life. You can explicitly indicate that you don't want your packets to be fragmented by setting the Don't Fragment bit in your IP header, and then when your packet hits a link where your packet exceeds the link MTU it'll send back a packet telling the remote that it's too big, what the actual MTU is, and the remote will resend a smaller packet. This avoids all the hassle of handling fragments in exchange for the cost of a retransmit the first time the MTU is exceeded. It also typically works these days, which wasn't always the case - people had a nasty habit of dropping the ICMP packets telling the remote that the packet was too big, which broke everything.

What I saw when I tcpdumped on the remote VPN endpoint's external interface was that the connection was getting established, and then a 1500 byte packet would arrive (this is kind of the behaviour you'd expect for video - the connection handshaking involves a bunch of relatively small packets, and then once you start sending the video stream itself you start sending packets that are as large as possible in order to minimise overhead). This 1500 byte packet wouldn't fit down the Wireguard link, so the endpoint sent back an ICMP packet to the remote telling it to send smaller packets. The remote should then have sent a new, smaller packet - instead, about a second after sending the first 1500 byte packet, it sent that same 1500 byte packet. This is consistent with it ignoring the ICMP notification and just behaving as if the packet had been dropped.

All the services that were failing were failing in identical ways, and all were using Fastly as their CDN. I complained about this on social media and then somehow ended up in contact with the engineering team responsible for this sort of thing - I sent them a packet dump of the failure, they were able to reproduce it, and it got fixed. Hurray!

(Between me identifying the problem and it getting fixed I was able to work around it. The TCP header includes a Maximum Segment Size (MSS) field, which indicates the maximum size of the payload for this connection. iptables allows you to rewrite this, so on the VPN endpoint I simply rewrote the MSS to be small enough that the packets would fit inside the Wireguard MTU. This isn't a complete fix since it's done at the TCP level rather than the IP level - so any large UDP packets would still end up breaking)

I've no idea what the underlying issue was, and at the client end the failure was entirely opaque: the remote simply stopped sending me packets. The only reason I was able to debug this at all was because I controlled the other end of the VPN as well, and even then I wouldn't have been able to do anything about it other than being in the fortuitous situation of someone able to do something about it seeing my post. How many people go through their lives dealing with things just being broken and having no idea why, and how do we fix that?

(Edit: thanks to this comment, it sounds like the underlying issue was a kernel bug that Fastly developed a fix for - under certain configurations, the kernel fails to associate the MTU update with the egress interface and so it continues sending overly large packets)

comment count unavailable comments

Valhalla's Things: Jeans, step one

Posted on February 19, 2024
Tags: madeof:atoms, craft:sewing, FreeSoftWear
CW for body size change mentions A woman wearing a pair of tight jeans. Just like the corset, I also needed a new pair of jeans. Back when my body size changed drastically of course my jeans no longer fit. While I was waiting for my size to stabilize I kept wearing them with a somewhat tight belt, but it was ugly and somewhat uncomfortable. When I had stopped changing a lot I tried to buy new ones in the same model, and found out that I was too thin for the menswear jeans of that shop. I could have gone back to wearing women s jeans, but I didn t want to have to deal with the crappy fabric and short pockets, so I basically spent a few years wearing mostly skirts, and oversized jeans when I really needed trousers. Meanwhile, I had drafted a jeans pattern for my SO, which we had planned to make in technical fabric, but ended up being made in a cotton-wool mystery mix for winter and in linen-cotton for summer, and the technical fabric version was no longer needed (yay for natural fibres!) It was clear what the solution to my jeans problems would have been, I just had to stop getting distracted by other projects and draft a new pattern using a womanswear block instead of a menswear one. Which, in January 2024 I finally did, and I believe it took a bit less time than the previous one, even if it had all of the same fiddly pieces. I already had a cut of the same cotton-linen I had used for my SO, except in black, and used it to make the pair this post is about. The parametric pattern is of course online, as #FreeSoftWear, at the usual place. This time it was faster, since I didn t have to write step-by-step instructions, as they are exactly the same as the other pattern. Same as above, from the back, with the crotch seam pulling a bit. A faint decoration can be seen on the pockets, with the line art version of the logo seen on this blog. Making also went smoothly, and the result was fitting. Very fitting. A big too fitting, and the standard bum adjustment of the back was just enough for what apparently still qualifies as a big bum, so I adjusted the pattern to be able to add a custom amount of ease in a few places. But at least I had a pair of jeans-shaped trousers that fit! Except, at 200 g/m I can t say that fabric is the proper weight for a pair of trousers, and I may have looked around online1 for some denim, and, well, it s 2024, so my no-fabric-buy 2023 has not been broken, right? Let us just say that there may be other jeans-related posts in the near future.

  1. I had already asked years ago for denim at my local fabric shops, but they don t have the proper, sturdy, type I was looking for.

13 February 2024

Matthew Palmer: Not all TLDs are Created Equal

In light of the recent cancellation of the queer.af domain registration by the Taliban, the fragile and difficult nature of country-code top-level domains (ccTLDs) has once again been comprehensively demonstrated. Since many people may not be aware of the risks, I thought I d give a solid explainer of the whole situation, and explain why you should, in general, not have anything to do with domains which are registered under ccTLDs.

Top-level What-Now? A top-level domain (TLD) is the last part of a domain name (the collection of words, separated by periods, after the https:// in your web browser s location bar). It s the com in example.com, or the af in queer.af. There are two kinds of TLDs: country-code TLDs (ccTLDs) and generic TLDs (gTLDs). Despite all being TLDs, they re very different beasts under the hood.

What s the Difference? Generic TLDs are what most organisations and individuals register their domains under: old-school technobabble like com , net , or org , historical oddities like gov , and the new-fangled world of words like tech , social , and bank . These gTLDs are all regulated under a set of rules created and administered by ICANN (the Internet Corporation for Assigned Names and Numbers ), which try to ensure that things aren t a complete wild-west, limiting things like price hikes (well, sometimes, anyway), and providing means for disputes over names1. Country-code TLDs, in contrast, are all two letters long2, and are given out to countries to do with as they please. While ICANN kinda-sorta has something to do with ccTLDs (in the sense that it makes them exist on the Internet), it has no authority to control how a ccTLD is managed. If a country decides to raise prices by 100x, or cancel all registrations that were made on the 12th of the month, there s nothing anyone can do about it. If that sounds bad, that s because it is. Also, it s not a theoretical problem the Taliban deciding to asssert its bigotry over the little corner of the Internet namespace it has taken control of is far from the first time that ccTLDs have caused grief.

Shifting Sands The queer.af cancellation is interesting because, at the time the domain was reportedly registered, 2018, Afghanistan had what one might describe as, at least, a different political climate. Since then, of course, things have changed, and the new bosses have decided to get a bit more active. Those running queer.af seem to have seen the writing on the wall, and were planning on moving to another, less fraught, domain, but hadn t completed that move when the Taliban came knocking.

The Curious Case of Brexit When the United Kingdom decided to leave the European Union, it fell foul of the EU s rules for the registration of domains under the eu ccTLD3. To register (and maintain) a domain name ending in .eu, you have to be a resident of the EU. When the UK ceased to be part of the EU, residents of the UK were no longer EU residents. Cue much unhappiness, wailing, and gnashing of teeth when this was pointed out to Britons. Some decided to give up their domains, and move to other parts of the Internet, while others managed to hold onto them by various legal sleight-of-hand (like having an EU company maintain the registration on their behalf). In any event, all very unpleasant for everyone involved.

Geopolitics on the Internet?!? After Russia invaded Ukraine in February 2022, the Ukranian Vice Prime Minister asked ICANN to suspend ccTLDs associated with Russia. While ICANN said that it wasn t going to do that, because it wouldn t do anything useful, some domain registrars (the companies you pay to register domain names) ceased to deal in Russian ccTLDs, and some websites restricted links to domains with Russian ccTLDs. Whether or not you agree with the sort of activism implied by these actions, the fact remains that even the actions of a government that aren t directly related to the Internet can have grave consequences for your domain name if it s registered under a ccTLD. I don t think any gTLD operator will be invading a neighbouring country any time soon.

Money, Money, Money, Must Be Funny When you register a domain name, you pay a registration fee to a registrar, who does administrative gubbins and causes you to be able to control the domain name in the DNS. However, you don t own that domain name4 you re only renting it. When the registration period comes to an end, you have to renew the domain name, or you ll cease to be able to control it. Given that a domain name is typically your brand or identity online, the chances are you d prefer to keep it over time, because moving to a new domain name is a massive pain, having to tell all your customers or users that now you re somewhere else, plus having to accept the risk of someone registering the domain name you used to have and capturing your traffic it s all a gigantic hassle. For gTLDs, ICANN has various rules around price increases and bait-and-switch pricing that tries to keep a lid on the worst excesses of registries. While there are any number of reasonable criticisms of the rules, and the Internet community has to stay on their toes to keep ICANN from totally succumbing to regulatory capture, at least in the gTLD space there s some degree of control over price gouging. On the other hand, ccTLDs have no effective controls over their pricing. For example, in 2008 the Seychelles increased the price of .sc domain names from US$25 to US$75. No reason, no warning, just pay up .

Who Is Even Getting That Money? A closely related concern about ccTLDs is that some of the cool ones are assigned to countries that are not great. The poster child for this is almost certainly Libya, which has the ccTLD ly . While Libya was being run by a terrorist-supporting extremist, companies thought it was a great idea to have domain names that ended in .ly. These domain registrations weren t (and aren t) cheap, and it s hard to imagine that at least some of that money wasn t going to benefit the Gaddafi regime. Similarly, the British Indian Ocean Territory, which has the io ccTLD, was created in a colonialist piece of chicanery that expelled thousands of native Chagossians from Diego Garcia. Money from the registration of .io domains doesn t go to the (former) residents of the Chagos islands, instead it gets paid to the UK government. Again, I m not trying to suggest that all gTLD operators are wonderful people, but it s not particularly likely that the direct beneficiaries of the operation of a gTLD stole an island chain and evicted the residents.

Are ccTLDs Ever Useful? The answer to that question is an unqualified maybe . I certainly don t think it s a good idea to register a domain under a ccTLD for vanity purposes: because it makes a word, is the same as a file extension you like, or because it looks cool. Those ccTLDs that clearly represent and are associated with a particular country are more likely to be OK, because there is less impetus for the registry to try a naked cash grab. Unfortunately, ccTLD registries have a disconcerting habit of changing their minds on whether they serve their geographic locality, such as when auDA decided to declare an open season in the .au namespace some years ago. Essentially, while a ccTLD may have geographic connotations now, there s not a lot of guarantee that they won t fall victim to scope creep in the future. Finally, it might be somewhat safer to register under a ccTLD if you live in the location involved. At least then you might have a better idea of whether your domain is likely to get pulled out from underneath you. Unfortunately, as the .eu example shows, living somewhere today is no guarantee you ll still be living there tomorrow, even if you don t move house. In short, I d suggest sticking to gTLDs. They re at least lower risk than ccTLDs.

+1, Helpful If you ve found this post informative, why not buy me a refreshing beverage? My typing fingers (both of them) thank you in advance for your generosity.

Footnotes
  1. don t make the mistake of thinking that I approve of ICANN or how it operates; it s an omnishambles of poor governance and incomprehensible decision-making.
  2. corresponding roughly, though not precisely (because everything has to be complicated, because humans are complicated), to the entries in the ISO standard for Codes for the representation of names of countries and their subdivisions , ISO 3166.
  3. yes, the EU is not a country; it s part of the roughly, though not precisely caveat mentioned previously.
  4. despite what domain registrars try very hard to imply, without falling foul of deceptive advertising regulations.

12 February 2024

Freexian Collaborators: Monthly report about Debian Long Term Support, January 2024 (by Roberto C. S nchez)

Like each month, have a look at the work funded by Freexian s Debian LTS offering.

Debian LTS contributors In January, 16 contributors have been paid to work on Debian LTS, their reports are available:
  • Abhijith PA did 14.0h (out of 7.0h assigned and 7.0h from previous period).
  • Bastien Roucari s did 22.0h (out of 16.0h assigned and 6.0h from previous period).
  • Ben Hutchings did 14.5h (out of 8.0h assigned and 16.0h from previous period), thus carrying over 9.5h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Daniel Leidert did 10.0h (out of 10.0h assigned).
  • Emilio Pozuelo Monfort did 10.0h (out of 14.75h assigned and 27.0h from previous period), thus carrying over 31.75h to the next month.
  • Guilhem Moulin did 9.75h (out of 25.0h assigned), thus carrying over 15.25h to the next month.
  • Holger Levsen did 3.5h (out of 12.0h assigned), thus carrying over 8.5h to the next month.
  • Markus Koschany did 40.0h (out of 40.0h assigned).
  • Roberto C. S nchez did 8.75h (out of 9.5h assigned and 2.5h from previous period), thus carrying over 3.25h to the next month.
  • Santiago Ruano Rinc n did 13.5h (out of 8.25h assigned and 7.75h from previous period), thus carrying over 2.5h to the next month.
  • Sean Whitton did 0.5h (out of 0.25h assigned and 5.75h from previous period), thus carrying over 5.5h to the next month.
  • Sylvain Beucler did 9.5h (out of 23.25h assigned and 18.5h from previous period), thus carrying over 32.25h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 12.0h (out of 10.25h assigned and 1.75h from previous period).
  • Utkarsh Gupta did 8.5h (out of 35.75h assigned), thus carrying over 24.75h to the next month.

Evolution of the situation In January, we have released 25 DLAs. A variety of particularly notable packages were updated during January. Among those updates were the Linux kernel (both versions 5.10 and 4.19), mariadb-10.3, openjdk-11, firefox-esr, and thunderbird. In addition to the many other LTS package updates which were released in January, LTS contributors continue their efforts to make impactful contributions both within the Debian community.

Thanks to our sponsors Sponsors that joined recently are in bold.

7 February 2024

Reproducible Builds: Reproducible Builds in January 2024

Welcome to the January 2024 report from the Reproducible Builds project. In these reports we outline the most important things that we have been up to over the past month. If you are interested in contributing to the project, please visit our Contribute page on our website.

How we executed a critical supply chain attack on PyTorch John Stawinski and Adnan Khan published a lengthy blog post detailing how they executed a supply-chain attack against PyTorch, a popular machine learning platform used by titans like Google, Meta, Boeing, and Lockheed Martin :
Our exploit path resulted in the ability to upload malicious PyTorch releases to GitHub, upload releases to [Amazon Web Services], potentially add code to the main repository branch, backdoor PyTorch dependencies the list goes on. In short, it was bad. Quite bad.
The attack pivoted on PyTorch s use of self-hosted runners as well as submitting a pull request to address a trivial typo in the project s README file to gain access to repository secrets and API keys that could subsequently be used for malicious purposes.

New Arch Linux forensic filesystem tool On our mailing list this month, long-time Reproducible Builds developer kpcyrd announced a new tool designed to forensically analyse Arch Linux filesystem images. Called archlinux-userland-fs-cmp, the tool is supposed to be used from a rescue image (any Linux) with an Arch install mounted to, [for example], /mnt. Crucially, however, at no point is any file from the mounted filesystem eval d or otherwise executed. Parsers are written in a memory safe language. More information about the tool can be found on their announcement message, as well as on the tool s homepage. A GIF of the tool in action is also available.

Issues with our SOURCE_DATE_EPOCH code? Chris Lamb started a thread on our mailing list summarising some potential problems with the source code snippet the Reproducible Builds project has been using to parse the SOURCE_DATE_EPOCH environment variable:
I m not 100% sure who originally wrote this code, but it was probably sometime in the ~2015 era, and it must be in a huge number of codebases by now. Anyway, Alejandro Colomar was working on the shadow security tool and pinged me regarding some potential issues with the code. You can see this conversation here.
Chris ends his message with a request that those with intimate or low-level knowledge of time_t, C types, overflows and the various parsing libraries in the C standard library (etc.) contribute with further info.

Distribution updates In Debian this month, Roland Clobus posted another detailed update of the status of reproducible ISO images on our mailing list. In particular, Roland helpfully summarised that all major desktops build reproducibly with bullseye, bookworm, trixie and sid provided they are built for a second time within the same DAK run (i.e. [within] 6 hours) . Additionally 7 of the 8 bookworm images from the official download link build reproducibly at any later time. In addition to this, three reviews of Debian packages were added, 17 were updated and 15 were removed this month adding to our knowledge about identified issues. Elsewhere, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Community updates There were made a number of improvements to our website, including Bernhard M. Wiedemann fixing a number of typos of the term nondeterministic . [ ] and Jan Zerebecki adding a substantial and highly welcome section to our page about SOURCE_DATE_EPOCH to document its interaction with distribution rebuilds. [ ].
diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes such as uploading versions 254 and 255 to Debian but focusing on triaging and/or merging code from other contributors. This included adding support for comparing eXtensible ARchive (.XAR/.PKG) files courtesy of Seth Michael Larson [ ][ ], as well considerable work from Vekhir in order to fix compatibility between various and subtle incompatible versions of the progressbar libraries in Python [ ][ ][ ][ ]. Thanks!

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In January, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Reduce the number of arm64 architecture workers from 24 to 16. [ ]
    • Use diffoscope from the Debian release being tested again. [ ]
    • Improve the handling when killing unwanted processes [ ][ ][ ] and be more verbose about it, too [ ].
    • Don t mark a job as failed if process marked as to-be-killed is already gone. [ ]
    • Display the architecture of builds that have been running for more than 48 hours. [ ]
    • Reboot arm64 nodes when they hit an OOM (out of memory) state. [ ]
  • Package rescheduling changes:
    • Reduce IRC notifications to 1 when rescheduling due to package status changes. [ ]
    • Correctly set SUDO_USER when rescheduling packages. [ ]
    • Automatically reschedule packages regressing to FTBFS (build failure) or FTBR (build success, but unreproducible). [ ]
  • OpenWrt-related changes:
    • Install the python3-dev and python3-pyelftools packages as they are now needed for the sunxi target. [ ][ ]
    • Also install the libpam0g-dev which is needed by some OpenWrt hardware targets. [ ]
  • Misc:
    • As it s January, set the real_year variable to 2024 [ ] and bump various copyright years as well [ ].
    • Fix a large (!) number of spelling mistakes in various scripts. [ ][ ][ ]
    • Prevent Squid and Systemd processes from being killed by the kernel s OOM killer. [ ]
    • Install the iptables tool everywhere, else our custom rc.local script fails. [ ]
    • Cleanup the /srv/workspace/pbuilder directory on boot. [ ]
    • Automatically restart Squid if it fails. [ ]
    • Limit the execution of chroot-installation jobs to a maximum of 4 concurrent runs. [ ][ ]
Significant amounts of node maintenance was performed by Holger Levsen (eg. [ ][ ][ ][ ][ ][ ][ ] etc.) and Vagrant Cascadian (eg. [ ][ ][ ][ ][ ][ ][ ][ ]). Indeed, Vagrant Cascadian handled an extended power outage for the network running the Debian armhf architecture test infrastructure. This provided the incentive to replace the UPS batteries and consolidate infrastructure to reduce future UPS load. [ ] Elsewhere in our infrastructure, however, Holger Levsen also adjusted the email configuration for @reproducible-builds.org to deal with a new SMTP email attack. [ ]

Upstream patches The Reproducible Builds project tries to detects, dissects and fix as many (currently) unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including: Separate to this, Vagrant Cascadian followed up with the relevant maintainers when reproducibility fixes were not included in newly-uploaded versions of the mm-common package in Debian this was quickly fixed, however. [ ]

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

6 February 2024

Robert McQueen: Flathub: Pros and Cons of Direct Uploads

I attended FOSDEM last weekend and had the pleasure to participate in the Flathub / Flatpak BOF on Saturday. A lot of the session was used up by an extensive discussion about the merits (or not) of allowing direct uploads versus building everything centrally on Flathub s infrastructure, and related concerns such as automated security/dependency scanning. My original motivation behind the idea was essentially two things. The first was to offer a simpler way forward for applications that use language-specific build tools that resolve and retrieve their own dependencies from the internet. Flathub doesn t allow network access during builds, and so a lot of manual work and additional tooling is currently needed (see Python and Electron Flatpak guides). And the second was to offer a maybe more familiar flow to developers from other platforms who would just build something and then run another command to upload it to the store, without having to learn the syntax of a new build tool. There were many valid concerns raised in the room, and I think on reflection that this is still worth doing, but might not be as valuable a way forward for Flathub as I had initially hoped. Of course, for a proprietary application where Flathub never sees the source or where it s built, whether that binary is uploaded to us or downloaded by us doesn t change much. But for an FLOSS application, a direct upload driven by the developer causes a regression on a number of fronts. We re not getting too hung up on the malicious developer inserts evil code in the binary case because Flathub already works on the model of verifying the developer and the user makes a decision to trust that app we don t review the source after all. But we do lose other things such as our infrastructure building on multiple architectures, and visibility on whether the build environment or upload credentials have been compromised unbeknownst to the developer. There is now a manual review process for when apps change their metadata such as name, icon, license and permissions which would apply to any direct uploads as well. It was suggested that if only heavily sandboxed apps (eg no direct filesystem access without proper use of portals) were permitted to make direct uploads, the impact of such concerns might be somewhat mitigated by the sandboxing. However, it was also pointed out that my go-to example of Electron app developers can upload to Flathub with one command was also a bit of a fiction. At present, none of them would pass that stricter sandboxing requirement. Almost all Electron apps run old versions of Chromium with less complete portal support, needing sandbox escapes to function correctly, and Electron (and Chromium s) sandboxing still needs additional tooling/downstream patching to run inside a Flatpak. Buh-boh. I think for established projects who already ship their own binaries from their own centralised/trusted infrastructure, and for developers who have understandable sensitivities about binary integrity such such as encryption, password or financial tools, it s a definite improvement that we re able to set up direct uploads with such projects with less manual work. There are already quite a few applications including verified ones where the build recipe simply fetches a binary built elsewhere and unpacks it, and if this already done centrally by the developer, repeating the exercise on Flathub s server adds little value. However for the individual developer experience, I think we need to zoom out a bit and think about how to improve this from a tools and infrastructure perspective as we grow Flathub, and as we seek to raise funds for different sources for these improvements. I took notes for everything that was mentioned as a tooling limitation during the BOF, along with a few ideas about how we could improve things, and hope to share these soon as part of an RFP/RFI (Request For Proposals/Request for Information) process. We don t have funding yet but if we have some prospective collaborators to help refine the scope and estimate the cost/effort, we can use this to go and pursue funding opportunities.

2 February 2024

Scarlett Gately Moore: Some exciting news! Kubuntu: I m back!!!

It s official, the Kubuntu Council has hired me part time to work on the 24.04 LTS release, preparation for Plasma 6, and to bring life back into the Distribution. First I want thank the Kubuntu Council for this opportunity and I plan a long and successful journey together!!!! My first week ( I started midweek ): It has been a busy one! Many meet and greets with the team and other interested parties. I had the chance to chat with Mike from Kubuntu Focus and I have to say I am absolutely amazed with the work they have done, and if you are in the market for a new laptop, you must check these out!!! https://kfocus.org Or if you want to try before you buy you can download the OS! All they ask is for an e-mail, which is completely reasonable. Hosting isn t free! Besides, you can opt out anytime and they don t share it with anyone. I look forward to working closely with this project. We now have a Kubuntu Team in KDE invent https://invent.kde.org/teams/distribution-kubuntu if you would like to join us, please don t hesitate to ask! I have started a new Wiki and our first page is the ever important Bug triaging! It is still a WIP but you can check it out here: https://invent.kde.org/teams/distribution-kubuntu/docs/-/wikis/Bug-Triage-Story-WIP , with that said I have started the launchpad work to make tracking our bugs easier buy subscribing kubuntu-bugs to all our packages and creating proper projects for our packages missing them. We have compiled a list of our various documentation links that need updated and Rick Timmis is updating kubuntu.org! Aaron Honeycutt has been busy with the Kubuntu Manual https://github.com/kubuntu-team/kubuntu-manual which is in good shape. We just need to improve our developer story  I have been working on the rather massive Apparmor bug https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/2046844 with testing the fixes from the ppa and writing profiles for the various KDE packages affected ( pretty much anything that uses webengine ) and making progress there. My next order of business staging Frameworks 5.114 with guidance from our super awesome Rik Mills that has been doing most of the heavy lifting in Kubuntu for many years now. So thank you for that Rik  I will also start on our big transition to the Calamaras Installer! I do have experience here, so I expect it will be a smooth one. I am so excited for the future of Kubuntu and the exciting things to come! With that said, the Kubuntu funding is community donation driven. There is enough to pay me part time for a couple contracts, but it will run out and a full-time contract would be super awesome. I am reaching out to anyone enjoying Kubuntu and want to help with the future of Kubuntu to please consider a donation! We are working on more donation options, but for now you can donate through paypal at https://kubuntu.org/donate/ Thank you!!!!!

Ian Jackson: UPS, the Useless Parcel Service; VAT and fees

I recently had the most astonishingly bad experience with UPS, the courier company. They severely damaged my parcels, and were very bad about UK import VAT, ultimately ending up harassing me on autopilot. The only thing that got their attention was my draft Particulars of Claim for intended legal action. Surprisingly, I got them to admit in writing that the disbursement fee they charge recipients alongside the actual VAT, is just something they made up with no legal basis. What happened Autumn last year I ordered some furniture from a company in Germany. This was to be shipped by them to me by courier. The supplier chose UPS. UPS misrouted one of the three parcels to Denmark. When everything arrived, it had been sat on by elephants. The supplier had to replace most of it, with considerable inconvenience and delay to me, and of course a loss to the supplier. But this post isn t mostly about that. This post is about VAT. You see, import VAT was due, because of fucking Brexit. UPS made a complete hash of collecting that VAT. Their computers can t issue coherent documents, their email helpdesk is completely useless, and their automated debt collection systems run along uninfluenced by any external input. The crazy, including legal threats and escalating late payment fees, continued even after I paid the VAT discrepancy (which I did despite them not yet having provided any coherent calculation for it). This kind of behaviour is a very small and mild version of the kind of things British Gas did to Lisa Ferguson, who eventually won substantial damages for harassment, plus 10K of costs. Having tried asking nicely, and sending stiff letters, I too threatened litigation. I would have actually started a court claim, but it would have included a claim under the Protection from Harassment Act. Those have to be filed under the Part 8 procedure , which involves sending all of the written evidence you re going to use along with the claim form. Collating all that would be a good deal of work, especially since UPS and ControlAccount didn t engage with me at all, so I had no idea which things they might actually dispute. So I decided that before issuing proceedings, I d send them a copy of my draft Particulars of Claim, along with an offer to settle if they would pay me a modest sum and stop being evil robots at me. Rather than me typing the whole tale in again, you can read the full gory details in the PDF of my draft Particulars of Claim. (I ve redacted the reference numbers). Outcome The draft Particulars finally got their attention. UPS sent me an offer: they agreed to pay me 50, in full and final settlement. That was close enough to my offer that I accepted it. I mostly wanted them to stop, and they do seem to have done so. And I ve received the 50. VAT calculation They also finally included an actual explanation of the VAT calculation. It s absurd, but it s not UPS s absurd:
The clearance was entered initially with estimated import charges of 400.03, consisting of 387.83 VAT, and 12.20 disbursement fee. This original entry regrettably did not include the freight cost for calculating the VAT, and as such when submitted for final entry the VAT value was adjusted to include this and an amended invoice was issued for an additional 39.84. HMRC calculate the amount against which VAT is raised using the value of goods, insurance and freight, however they also may apply a VAT adjustment figure. The VAT Adjustment is based on many factors (Incidental costs in regards to a shipment), which includes charge for currency conversion if the invoice does not list values in Sterling, but the main is due to the inland freight from airport of destination to the final delivery point, as this charge varies, for example, from EMA to Edinburgh would be 150, from EMA to Derby would be 1, so each year UPS must supply HMRC with all values incurred for entry build up and they give an average which UPS have to use on the entry build up as the VAT Adjustment. The correct calculation for the import charges is therefore as follows: Goods value divided by exchange rate 2,489.53 EUR / 1.1683 = 2,130.89 GBP Duty: Goods value plus freight (%) 2,130.89 GBP + 5% = 2,237.43 GBP. That total times the duty rate. X 0 % = 0 GBP VAT: Goods value plus freight (100%) 2,130.89 GBP + 0 = 2,130.89 GBP That total plus duty and VAT adjustment 2,130.89 GBP + 0 GBP + 7.49 GBP = 2,348.08 GBP. That total times 20% VAT = 427.67 GBP As detailed above we must confirm that the final VAT charges applied to the shipment were correct, and that no refund of this is therefore due.
This looks very like HMRC-originated nonsense. If only they had put it on the original bills! It s completely ridiculous that it took four months and near-litigation to obtain it. Disbursement fee One more thing. UPS billed me a 12 disbursement fee . When you import something, there s often tax to pay. The courier company pays that to the government, and the consignee pays it to the courier. Usually the courier demands it before final delivery, since otherwise they end up having to chase it as a debt. It is common for parcel companies to add a random fee of their own. As I note in my Particulars, there isn t any legal basis for this. In my own offer of settlement I proposed that UPS should:
State under what principle of English law (such as, what enactment or principle of Common Law), you levy the disbursement fee (or refund it).
To my surprise they actually responded to this in their own settlement letter. (They didn t, for example, mention the harassment at all.) They said (emphasis mine):
A disbursement fee is a fee for amounts paid or processed on behalf of a client. It is an established category of charge used by legal firms, amongst other companies, for billing of various ancillary costs which may be incurred in completion of service. Disbursement fees are not covered by a specific law, nor are they legally prohibited. Regarding UPS disbursement fee this is an administrative charge levied for the use of UPS deferment account to prepay import charges for clearance through CDS. This charge would therefore be billed to the party that is responsible for the import charges, normally the consignee or receiver of the shipment in question. The disbursement fee as applied is legitimate, and as you have stated is a commonly used and recognised charge throughout the courier industry, and I can confirm that this was charged correctly in this instance.
On UPS s analysis, they can just make up whatever fee they like. That is clearly not right (and I don t even need to refer to consumer protection law, which would also make it obviously unlawful). And, that everyone does it doesn t make it lawful. There are so many things that are ubiquitous but unlawful, especially nowadays when much of the legal system - especially consumer protection regulators - has been underfunded to beyond the point of collapse. Next time this comes up I might have a go at getting the fee back. (Obviously I ll have to pay it first, to get my parcel.) ParcelForce and Royal Mail I think this analysis doesn t apply to ParcelForce and (probably) Royal Mail. I looked into this in 2009, and I found that Parcelforce had been given the ability to write their own private laws: Schemes made under section 89 of the Postal Services Act 2000. This is obviously ridiculous but I think it was the law in 2009. I doubt the intervening governments have fixed it. Furniture Oh, yes, the actual furniture. The replacements arrived intact and are great :-).

comment count unavailable comments

1 February 2024

Paul Wise: FLOSS Activities January 2024

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes
  • OpenStreetMap: fixed a bunch of broken website URLs
  • Debian pass-otp: oathtool safety
  • reportbug: fix crash
  • Debian BTS usertags: fix Python, Ruby, QA, porter, archive, release tags
  • Debian wiki pages: TransitionUploadHook

Issues

Review
  • Debian BTS usertags: changes for the month
  • Debian screenshots:

Administration
  • Debian wiki: unblock IP addresses, approve accounts

Communication
  • Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors All work was done on a volunteer basis.

29 January 2024

Russell Coker: Thinkpad X1 Yoga Gen3

I just bought myself a Thinkpad X1 Yoga Gen3 for $359.10. I have been quite happy with the Thinkpad X1 Carbon Gen5 I ve had for just over a year (apart from my mistake in buying one with lost password) [1] and I normally try to get more use out of a computer than that. If I divide total cost by the time that I ve had it working that comes out to about $1.30 per day. I would pay more than that for a laptop and I have paid much more than that for laptops in the past, but I prefer not to. I was initially tempted to buy a new Thinkpad by the prices of high end X1 devices dropping, this new Yoga has 16G of RAM and a 2560*1440 screen that s a good upgrade from 8G with 1920*1080. The CPU of my new Thinkpad is a quad core i5-8350U that rates 6226 [2] and is a decent upgrade from the dual core i5-6300U that rates 3239 [3] although that wasn t a factor as I found the old CPU fast enough. The Yoga Gen3 has a minimum weight of 1.4Kg and mine might not be the lightest model in the range while the old Carbon weighs 1.14Kg. I can really feel the difference. It s also slightly larger but fortunately still fits in the pocket of my Scottware jacket. The higher resolution screen and more RAM were not sufficient to make me want to spend some money. The deciding factor is that as I m working on phones with touch screens it is a benefit to use a laptop with a touch screen so I can do more testing. The Yoga I bought was going cheap because the touch part of the touch screen is broken but the stylus still works, this is apparently a common failure mode of the Yoga. The Yoga has a brighter screen than the Carbon and seems to have better contrast. I think Lenovo had some newer technology for that generation of laptops or maybe my Carbon is slightly defective in that regard. It s a hazard of buying second hand that if something basically works but isn t quite as good as it should be then you will never know. I m happy with this purchase and I recommend that everyone who buys laptops secondhand the way I do only get 1440p or better displays. I ve currently got the Kitty terminal emulator [4] setup with 9 windows that each have 103 or 104 columns and 26 or 28 rows of text. That s a lot of terminals on a laptop screen!

28 January 2024

Niels Thykier: Annotating the Debian packaging directory

In my previous blog post Providing online reference documentation for debputy, I made a point about how debhelper documentation was suboptimal on account of being static rather than online. The thing is that debhelper is not alone in this problem space, even if it is a major contributor to the number of packaging files you have to to know about. If we look at the "competition" here such as Fedora and Arch Linux, they tend to only have one packaging file. While most Debian people will tell you a long list of cons about having one packaging file (such a Fedora's spec file being 3+ domain specific languages "mashed" into one file), one major advantage is that there is only "the one packaging file". You only need to remember where to find the documentation for one file, which is great when you are running on wetware with limited storage capacity. Which means as a newbie, you can dedicate less mental resources to tracking multiple files and how they interact and more effort understanding the "one file" at hand. I started by asking myself how can we in Debian make the packaging stack more accessible to newcomers? Spoiler alert, I dug myself into rabbit hole and ended up somewhere else than where I thought I was going. I started by wanting to scan the debian directory and annotate all files that I could with documentation links. The logic was that if debputy could do that for you, then you could spend more mental effort elsewhere. So I combined debputy's packager provided files detection with a static list of files and I quickly had a good starting point for debputy-based packages.
Adding (non-static) dpkg and debhelper files to the mix Now, I could have closed the topic here and said "Look, I did debputy files plus couple of super common files". But I decided to take it a bit further. I added support for handling some dpkg files like packager provided files (such as debian/substvars and debian/symbols). But even then, we all know that debhelper is the big hurdle and a major part of the omission... In another previous blog post (A new Debian package helper: debputy), I made a point about how debputy could list all auxiliary files while debhelper could not. This was exactly the kind of feature that I would need for this feature, if this feature was to cover debhelper. Now, I also remarked in that blog post that I was not willing to maintain such a list. Also, I may have ranted about static documentation being unhelpful for debhelper as it excludes third-party provided tooling. Fortunately, a recent update to dh_assistant had provided some basic plumbing for loading dh sequences. This meant that getting a list of all relevant commands for a source package was a lot easier than it used to be. Once you have a list of commands, it would be possible to check all of them for dh's NOOP PROMISE hints. In these hints, a command can assert it does nothing if a given pkgfile is not present. This lead to the new dh_assistant list-guessed-dh-config-files command that will list all declared pkgfiles and which helpers listed them. With this combined feature set in place, debputy could call dh_assistant to get a list of pkgfiles, pretend they were packager provided files and annotate those along with manpage for the relevant debhelper command. The exciting thing about letting debpputy resolve the pkgfiles is that debputy will resolve "named" files automatically (debhelper tools will only do so when --name is passed), so it is much more likely to detect named pkgfiles correctly too. Side note: I am going to ignore the elephant in the room for now, which is dh_installsystemd and its package@.service files and the wide-spread use of debian/foo.service where there is no package called foo. For the latter case, the "proper" name would be debian/pkg.foo.service. With the new dh_assistant feature done and added to debputy, debputy could now detect the ubiquitous debian/install file. Excellent. But less great was that the very common debian/docs file was not. Turns out that dh_installdocs cannot be skipped by dh, so it cannot have NOOP PROMISE hints. Meh... Well, dh_assistant could learn about a new INTROSPECTABLE marker in addition to the NOOP PROMISE and then I could sprinkle that into a few commands. Indeed that worked and meant that debian/postinst (etc.) are now also detectable. At this point, debputy would be able to identify a wide range of debhelper related configuration files in debian/ and at least associate each of them with one or more commands. Nice, surely, this would be a good place to stop, right...?
Adding more metadata to the files The debhelper detected files only had a command name and manpage URI to that command. It would be nice if we could contextualize this a bit more. Like is this file installed into the package as is like debian/pam or is it a file list to be processed like debian/install. To make this distinction, I could add the most common debhelper file types to my static list and then merge the result together. Except, I do not want to maintain a full list in debputy. Fortunately, debputy has a quite extensible plugin infrastructure, so added a new plugin feature to provide this kind of detail and now I can outsource the problem! I split my definitions into two and placed the generic ones in the debputy-documentation plugin and moved the debhelper related ones to debhelper-documentation. Additionally, third-party dh addons could provide their own debputy plugin to add context to their configuration files. So, this gave birth file categories and configuration features, which described each file on different fronts. As an example, debian/gbp.conf could be tagged as a maint-config to signal that it is not directly related to the package build but more of a tool or style preference file. On the other hand, debian/install and debian/debputy.manifest would both be tagged as a pkg-helper-config. Files like debian/pam were tagged as ppf-file for packager provided file and so on. I mentioned configuration features above and those were added because, I have had a beef with debhelper's "standard" configuration file format as read by filearray and filedoublearray. They are often considered simple to understand, but it is hard to know how a tool will actually read the file. As an example, consider the following:
  • Will the debhelper use filearray, filedoublearray or none of them to read the file? This topic has about 2 bits of entropy.
  • Will the config file be executed if it is marked executable assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bit of entropy depending on the context.
  • Will the config file be subject to glob expansions? This topic sounds like a boolean but is a complicated mess. The globs can be handled either by debhelper as it parses the file for you. In this case, the globs are applied to every token. However, this is not what dh_install does. Here the last token on each line is supposed to be a directory and therefore not subject to globs. Therefore, dh_install does the globbing itself afterwards but only on part of the tokens. So that is about 2 bits of entropy more. Actually, it gets worse...
    • If the file is executed, debhelper will refuse to expand globs in the output of the command, which was a deliberate design choice by the original debhelper maintainer took when he introduced the feature in debhelper/8.9.12. Except, dh_install feature interacts with the design choice and does enable glob expansion in the tool output, because it does so manually after its filedoublearray call.
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these difference, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags including dh-executable-config and dh-glob-after-execute. Then, I added a datatable of these tags, so it would be easy for people to look up what they meant. Ok, this seems like a closed deal, right...?
Context, context, context However, the dh-executable-config tag among other are only applicable in compat 9 or later. It does not seem newbie friendly if you are told that this feature exist, but then have to read in the extended description that that it actually does not apply to your package. This problem seems fixable. Thanks to dh_assistant, it is easy to figure out which compat level the package is using. Then tweak some metadata to enable per compat level rules. With that tags like dh-executable-config only appears for packages using compat 9 or later. Also, debputy should be able to tell you where packager provided files like debian/pam are installed. We already have the logic for packager provided files that debputy supports and I am already using debputy engine for detecting the files. If only the plugin provided metadata gave me the install pattern, debputy would be able tell you where this file goes in the package. Indeed, a bit of tweaking later and setting install-pattern to usr/lib/pam.d/ name , debputy presented me with the correct install-path with the package name placing the name placeholder. Now, I have been using debian/pam as an example, because debian/pam is installed into usr/lib/pam.d in compat 14. But in earlier compat levels, it was installed into etc/pam.d. Well, I already had an infrastructure for doing compat file tags. Off we go to add install-pattern to the complat level infrastructure and now changing the compat level would change the path. Great. (Bug warning: The value is off-by-one in the current version of debhelper. This is fixed in git) Also, while we are in this install-pattern business, a number of debhelper config files causes files to be installed into a fixed directory. Like debian/docs which causes file to be installed into /usr/share/docs/ package . Surely, we can expand that as well and provide that bit of context too... and done. (Bug warning: The code currently does not account for the main documentation package context) It is rather common pattern for people to do debian/foo.in files, because they want to custom generation of debian/foo. Which means if you have debian/foo you get "Oh, let me tell you about debian/foo ". Then you rename it to debian/foo.in and the result is "debian/foo.in is a total mystery to me!". That is suboptimal, so lets detect those as well as if they were the original file but add a tag saying that they are a generate template and which file we suspect it generates. Finally, if you use debputy, almost all of the standard debhelper commands are removed from the sequence, since debputy replaces them. It would be weird if these commands still contributed configuration files when they are not actually going to be invoked. This mostly happened naturally due to the way the underlying dh_assistant command works. However, any file mentioned by the debhelper-documentation plugin would still appear unfortunately. So off I went to filter the list of known configuration files against which dh_ commands that dh_assistant thought would be used for this package.
Wrapping it up I was several layers into this and had to dig myself out. I have ended up with a lot of data and metadata. But it was quite difficult for me to arrange the output in a user friendly manner. However, all this data did seem like it would be useful any tool that wants to understand more about the package. So to get out of the rabbit hole, I for now wrapped all of this into JSON and now we have a debputy tool-support annotate-debian-directory command that might be useful for other tools. To try it out, you can try the following demo: In another day, I will figure out how to structure this output so it is useful for non-machine consumers. Suggestions are welcome. :)
Limitations of the approach As a closing remark, I should probably remind people that this feature relies heavily on declarative features. These include:
  • When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
  • When debhelper commands use NOOP promise hints, dh_assistant can "see" the config files listed those hints, meaning the file will at least be detected. For new introspectable hint and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!

Russell Coker: Links January 2024

Long Now has an insightful article about domestication that considers whether humans have evolved to want to control nature [1]. The OMG Elite hacker cable is an interesting device [2]. A Wifi device in a USB cable to allow remote control and monitoring of data transfer, including remote keyboard control and sniffing. Pity that USB-C cables have chips in them so you can t use a spark to remove unwanted chips from modern cables. David Brin s blog post The core goal of tyrants: The Red-Caesar Cult and a restored era of The Great Man has some insightful points about authoritarianism [3]. Ron Garret wrote an interesting argument against Christianity [4], and a follow-up titled Why I Don t Believe in Jesus [5]. He has a link to a well written article about the different theologies of Jesus and Paul [6]. Dimitri John Ledkov wrote an interesting blog post about how they reduced disk space for Ubuntu kernel packages and RAM for the initramfs phase of boot [7]. I hope this gets copied to Debian soon. Joey Hess wrote an interesting blog post about trying to make LLM systems produce bad code if trained on his code without permission [8]. Arstechnica has an interesting summary of research into the security of fingerprint sensors [9]. Not surprising that the products of the 3 vendors that supply almost all PC fingerprint readers are easy to compromise. Bruce Schneier wrote an insightful blog post about how AI will allow mass spying (as opposed to mass surveillance) [10]. ZDnet has an informative article How to Write Better ChatGPT Prompts in 5 Steps [11]. I sent this to a bunch of my relatives. AbortRetryFail has an interesting article about the Itanic Saga [12]. Erberus sounds interesting, maybe VLIW designs could give a good ration of instructions to power unlike the Itanium which was notorious for being power hungry. Bruce Schneier wrote an insightful article about AI and Trust [13]. We really need laws controlling these things! David Brin wrote an interesting blog post on the obsession with historical cycles [14].

24 January 2024

Louis-Philippe V ronneau: Montreal Subway Foot Traffic Data, 2023 edition

For the fifth year in a row, I've asked Soci t de Transport de Montr al, Montreal's transit agency, for the foot traffic data of Montreal's subway. By clicking on a subway station, you'll be redirected to a graph of the station's foot traffic. Licences

21 January 2024

Debian Brasil: MiniDebConf BH 2024 - patroc nio e financiamento coletivo

MiniDebConf BH 2024 J est rolando a inscri o de participante e a chamada de atividades para a MiniDebConf Belo Horizonte 2024, que acontecer de 27 a 30 de abril no Campus Pampulha da UFMG. Este ano estamos ofertando bolsas de alimenta o, hospedagem e passagens para contribuidores(as) ativos(as) do Projeto Debian. Patroc nio: Para a realiza o da MiniDebConf, estamos buscando patroc nio financeiro de empresas e entidades. Ent o se voc trabalha em uma empresa/entidade (ou conhece algu m que trabalha em uma) indique o nosso plano de patroc nio para ela. L voc ver os valores de cada cota e os seus benef cios. Financiamento coletivo: Mas voc tamb m pode ajudar a realiza o da MiniDebConf por meio do nosso financiamento coletivo! Fa a uma doa o de qualquer valor e tenha o seu nome publicado no site do evento como apoiador(a) da MiniDebConf Belo Horizonte 2024. Mesmo que voc n o pretenda vir a Belo Horizonte para participar do evento, voc pode doar e assim contribuir para o mais importante evento do Projeto Debian no Brasil. Contato Qualquer d vida, mande um email para contato@debianbrasil.org.br Organiza o Debian Brasil Debian Debian MG DCC

20 January 2024

Niels Thykier: Making debputy: Writing declarative parsing logic

In this blog post, I will cover how debputy parses its manifest and the conceptual improvements I did to make parsing of the manifest easier. All instructions to debputy are provided via the debian/debputy.manifest file and said manifest is written in the YAML format. After the YAML parser has read the basic file structure, debputy does another pass over the data to extract the information from the basic structure. As an example, the following YAML file:
manifest-version: "0.1"
installations:
  - install:
      source: foo
      dest-dir: usr/bin
would be transformed by the YAML parser into a structure resembling:
 
  "manifest-version": "0.1",
  "installations": [
      
       "install":  
         "source": "foo",
         "dest-dir": "usr/bin",
        
      
  ]
 
This structure is then what debputy does a pass on to translate this into an even higher level format where the "install" part is translated into an InstallRule. In the original prototype of debputy, I would hand-write functions to extract the data that should be transformed into the internal in-memory high level format. However, it was quite tedious. Especially because I wanted to catch every possible error condition and report "You are missing the required field X at Y" rather than the opaque KeyError: X message that would have been the default. Beyond being tedious, it was also quite error prone. As an example, in debputy/0.1.4 I added support for the install rule and you should allegedly have been able to add a dest-dir: or an as: inside it. Except I crewed up the code and debputy was attempting to look up these keywords from a dict that could never have them. Hand-writing these parsers were so annoying that it demotivated me from making manifest related changes to debputy simply because I did not want to code the parsing logic. When I got this realization, I figured I had to solve this problem better. While reflecting on this, I also considered that I eventually wanted plugins to be able to add vocabulary to the manifest. If the API was "provide a callback to extract the details of whatever the user provided here", then the result would be bad.
  1. Most plugins would probably throw KeyError: X or ValueError style errors for quite a while. Worst case, they would end on my table because the user would have a hard time telling where debputy ends and where the plugins starts. "Best" case, I would teach debputy to say "This poor error message was brought to you by plugin foo. Go complain to them". Either way, it would be a bad user experience.
  2. This even assumes plugin providers would actually bother writing manifest parsing code. If it is that difficult, then just providing a custom file in debian might tempt plugin providers and that would undermine the idea of having the manifest be the sole input for debputy.
So beyond me being unsatisfied with the current situation, it was also clear to me that I needed to come up with a better solution if I wanted externally provided plugins for debputy. To put a bit more perspective on what I expected from the end result:
  1. It had to cover as many parsing errors as possible. An error case this code would handle for you, would be an error where I could ensure it sufficient degree of detail and context for the user.
  2. It should be type-safe / provide typing support such that IDEs/mypy could help you when you work on the parsed result.
  3. It had to support "normalization" of the input, such as
           # User provides
           - install: "foo"
           # Which is normalized into:
           - install:
               source: "foo"
4) It must be simple to tell  debputy  what input you expected.
At this point, I remembered that I had seen a Python (PYPI) package where you could give it a TypedDict and an arbitrary input (Sadly, I do not remember the name). The package would then validate the said input against the TypedDict. If the match was successful, you would get the result back casted as the TypedDict. If the match was unsuccessful, the code would raise an error for you. Conceptually, this seemed to be a good starting point for where I wanted to be. Then I looked a bit on the normalization requirement (point 3). What is really going on here is that you have two "schemas" for the input. One is what the programmer will see (the normalized form) and the other is what the user can input (the manifest form). The problem is providing an automatic normalization from the user input to the simplified programmer structure. To expand a bit on the following example:
# User provides
- install: "foo"
# Which is normalized into:
- install:
    source: "foo"
Given that install has the attributes source, sources, dest-dir, as, into, and when, how exactly would you automatically normalize "foo" (str) into source: "foo"? Even if the code filtered by "type" for these attributes, you would end up with at least source, dest-dir, and as as candidates. Turns out that TypedDict actually got this covered. But the Python package was not going in this direction, so I parked it here and started looking into doing my own. At this point, I had a general idea of what I wanted. When defining an extension to the manifest, the plugin would provide debputy with one or two definitions of TypedDict. The first one would be the "parsed" or "target" format, which would be the normalized form that plugin provider wanted to work on. For this example, lets look at an earlier version of the install-examples rule:
# Example input matching this typed dict.
#    
#       "source": ["foo"]
#       "into": ["pkg"]
#    
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
In this form, the install-examples has two attributes - both are list of strings. On the flip side, what the user can input would look something like this:
# Example input matching this typed dict.
#    
#       "source": "foo"
#       "into": "pkg"
#    
#
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
FullInstallExamplesManifestFormat = Union[
    InstallExamplesManifestFormat,
    List[str],
    str,
]
The idea was that the plugin provider would use these two definitions to tell debputy how to parse install-examples. Pseudo-registration code could look something like:
def _handler(
    normalized_form: InstallExamplesTargetFormat,
) -> InstallRule:
    ...  # Do something with the normalized form and return an InstallRule.
concept_debputy_api.add_install_rule(
  keyword="install-examples",
  target_form=InstallExamplesTargetFormat,
  manifest_form=FullInstallExamplesManifestFormat,
  handler=_handler,
)
This was my conceptual target and while the current actual API ended up being slightly different, the core concept remains the same.
From concept to basic implementation Building this code is kind like swallowing an elephant. There was no way I would just sit down and write it from one end to the other. So the first prototype of this did not have all the features it has now. Spoiler warning, these next couple of sections will contain some Python typing details. When reading this, it might be helpful to know things such as Union[str, List[str]] being the Python type for either a str (string) or a List[str] (list of strings). If typing makes your head spin, these sections might less interesting for you. To build this required a lot of playing around with Python's introspection and typing APIs. My very first draft only had one "schema" (the normalized form) and had the following features:
  • Read TypedDict.__required_attributes__ and TypedDict.__optional_attributes__ to determine which attributes where present and which were required. This was used for reporting errors when the input did not match.
  • Read the types of the provided TypedDict, strip the Required / NotRequired markers and use basic isinstance checks based on the resulting type for str and List[str]. Again, used for reporting errors when the input did not match.
This prototype did not take a long (I remember it being within a day) and worked surprisingly well though with some poor error messages here and there. Now came the first challenge, adding the manifest format schema plus relevant normalization rules. The very first normalization I did was transforming into: Union[str, List[str]] into into: List[str]. At that time, source was not a separate attribute. Instead, sources was a Union[str, List[str]], so it was the only normalization I needed for all my use-cases at the time. There are two problems when writing a normalization. First is determining what the "source" type is, what the target type is and how they relate. The second is providing a runtime rule for normalizing from the manifest format into the target format. Keeping it simple, the runtime normalizer for Union[str, List[str]] -> List[str] was written as:
def normalize_into_list(x: Union[str, List[str]]) -> List[str]:
    return x if isinstance(x, list) else [x]
This basic form basically works for all types (assuming none of the types will have List[List[...]]). The logic for determining when this rule is applicable is slightly more involved. My current code is about 100 lines of Python code that would probably lose most of the casual readers. For the interested, you are looking for _union_narrowing in declarative_parser.py With this, when the manifest format had Union[str, List[str]] and the target format had List[str] the generated parser would silently map a string into a list of strings for the plugin provider. But with that in place, I had covered the basics of what I needed to get started. I was quite excited about this milestone of having my first keyword parsed without handwriting the parser logic (at the expense of writing a more generic parse-generator framework).
Adding the first parse hint With the basic implementation done, I looked at what to do next. As mentioned, at the time sources in the manifest format was Union[str, List[str]] and I considered to split into a source: str and a sources: List[str] on the manifest side while keeping the normalized form as sources: List[str]. I ended up committing to this change and that meant I had to solve the problem getting my parser generator to understand the situation:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
There are two related problems to solve here:
  1. How will the parser generator understand that source should be normalized and then mapped into sources?
  2. Once that is solved, the parser generator has to understand that while source and sources are declared as NotRequired, they are part of a exactly one of rule (since sources in the target form is Required). This mainly came down to extra book keeping and an extra layer of validation once the previous step is solved.
While working on all of this type introspection for Python, I had noted the Annotated[X, ...] type. It is basically a fake type that enables you to attach metadata into the type system. A very random example:
# For all intents and purposes,  foo  is a string despite all the  Annotated  stuff.
foo: Annotated[str, "hello world"] = "my string here"
The exciting thing is that you can put arbitrary details into the type field and read it out again in your introspection code. Which meant, I could add "parse hints" into the type. Some "quick" prototyping later (a day or so), I got the following to work:
# Map from
#      
#        "source": "foo"  # (or "sources": ["foo"])
#        "into": "pkg"
#      
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
# ... into
#      
#        "source": ["foo"]
#        "into": ["pkg"]
#      
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
Without me (as a plugin provider) writing a line of code, I can have debputy rename or "merge" attributes from the manifest form into the normalized form. Obviously, this required me (as the debputy maintainer) to write a lot code so other me and future plugin providers did not have to write it.
High level typing At this point, basic normalization between one mapping to another mapping form worked. But one thing irked me with these install rules. The into was a list of strings when the parser handed them over to me. However, I needed to map them to the actual BinaryPackage (for technical reasons). While I felt I was careful with my manual mapping, I knew this was exactly the kind of case where a busy programmer would skip the "is this a known package name" check and some user would typo their package resulting in an opaque KeyError: foo. Side note: "Some user" was me today and I was super glad to see debputy tell me that I had typoed a package name (I would have been more happy if I had remembered to use debputy check-manifest, so I did not have to wait through the upstream part of the build that happened before debhelper passed control to debputy...) I thought adding this feature would be simple enough. It basically needs two things:
  1. Conversion table where the parser generator can tell that BinaryPackage requires an input of str and a callback to map from str to BinaryPackage. (That is probably lie. I think the conversion table came later, but honestly I do remember and I am not digging into the git history for this one)
  2. At runtime, said callback needed access to the list of known packages, so it could resolve the provided string.
It was not super difficult given the existing infrastructure, but it did take some hours of coding and debugging. Additionally, I added a parse hint to support making the into conditional based on whether it was a single binary package. With this done, you could now write something like:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[BinaryPackage, List[BinaryPackage]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[
        Annotated[
            List[BinaryPackage],
            DebputyParseHint.required_when_multi_binary()
        ]
    ]
Code-wise, I still had to check for into being absent and providing a default for that case (that is still true in the current codebase - I will hopefully fix that eventually). But I now had less room for mistakes and a standardized error message when you misspell the package name, which was a plus.
The added side-effect - Introspection A lovely side-effect of all the parsing logic being provided to debputy in a declarative form was that the generated parser snippets had fields containing all expected attributes with their types, which attributes were required, etc. This meant that adding an introspection feature where you can ask debputy "What does an install rule look like?" was quite easy. The code base already knew all of this, so the "hard" part was resolving the input the to concrete rule and then rendering it to the user. I added this feature recently along with the ability to provide online documentation for parser rules. I covered that in more details in my blog post Providing online reference documentation for debputy in case you are interested. :)
Wrapping it up This was a short insight into how debputy parses your input. With this declarative technique:
  • The parser engine handles most of the error reporting meaning users get most of the errors in a standard format without the plugin provider having to spend any effort on it. There will be some effort in more complex cases. But the common cases are done for you.
  • It is easy to provide flexibility to users while avoiding having to write code to normalize the user input into a simplified programmer oriented format.
  • The parser handles mapping from basic types into higher forms for you. These days, we have high level types like FileSystemMode (either an octal or a symbolic mode), different kind of file system matches depending on whether globs should be performed, etc. These types includes their own validation and parsing rules that debputy handles for you.
  • Introspection and support for providing online reference documentation. Also, debputy checks that the provided attribute documentation covers all the attributes in the manifest form. If you add a new attribute, debputy will remind you if you forget to document it as well. :)
In this way everybody wins. Yes, writing this parser generator code was more enjoyable than writing the ad-hoc manual parsers it replaced. :)

18 January 2024

Russell Coker: LicheePi 4A (RISC-V) First Look

I Just bought a LicheePi 4A RISC-V embedded computer (like a RaspberryPi but with a RISC-V CPU) for $322.68 from Aliexpress (the official site for buying LicheePi devices). Here is the Sipheed web page about it and their other recent offerings [1]. I got the version with 16G of RAM and 128G of storage, I probably don t need that much storage (I can use NFS or USB) but 16G of RAM is good for VMs. Here is the Wiki about this board [2]. Configuration When you get one of these devices you should make setting up ssh server your first priority. I found the HDMI output to be very unreliable. The first monitor I tried was a Samsung 4K monitor dating from when 4K was a new thing, the LicheePi initially refused to operate at a resolution higher than 1024*768 but later on switched to 4K resolution when resuming from screen-blank for no apparent reason (and the window manager didn t support this properly). On the Dell 4K monitor I use on my main workstation it sometimes refused to talk to it and occasionally worked. I got it running at 1920*1080 without problems and then switched it to 4K and it lost video sync and never talked to that monitor again. On my Desklab portabable 4K monitor I got it to display in 4K resolution but only the top left 1/4 of the screen displayed. The issues with HDMI monitor support greatly limit the immediate potential for using this as a workstation. It doesn t make it impossible but would be fiddly at best. It s quite likely that a future OS update will fix this. But at the moment it s best used as a server. The LicheePi has a custom Linux distribution based on Ubuntu so you want too put something like the following in /etc/network/interfaces to make it automatically connect to the ethernet when plugged in:
auto end0
iface end0 inet dhcp
Then to get sshd to start you have to run the following commands to generate ssh host keys that aren t zero bytes long:
rm /etc/ssh/ssh_host_*
systemctl restart ssh.service
It appears to have wifi hardware but the OS doesn t recognise it. This isn t a priority for me as I mostly want to use it as a server. Performance For the first test of performance I created a 100MB file from /dev/urandom and then tried compressing it on various systems. With zstd -9 it took 16.893 user seconds on the LicheePi4A, 0.428s on my Thinkpad X1 Carbon Gen5 with a i5-6300U CPU (Debian/Unstable), 1.288s on my E5-2696 v3 workstation (Debian/Bookworm), 0.467s on the E5-2696 v3 running Debian/Unstable, 2.067s on a E3-1271 v3 server, and 7.179s on the E3-1271 v3 system emulating a RISC-V system via QEMU running Debian/Unstable. It s very impressive that the QEMU emulation is fast enough that emulating a different CPU architecture is only 3.5* slower for this test (or maybe 10* slower if it was running Debian/Unstable on the AMD64 code)! The emulated RISC-V is also more than twice as fast as real RISC-V hardware and probably of comparable speed to real RISC-V hardware when running the same versions (and might be slightly slower if running the same version of zstd) which is a tribute to the quality of emulation. One performance issue that most people don t notice is the time taken to negotiate ssh sessions. It s usually not noticed because the common CPUs have got faster at about the same rate as the algorithms for encryption and authentication have become more complex. On my i5-6300U laptop it takes 0m0.384s to run ssh -i ~/.ssh/id_ed25519 localhost id with the below server settings (taken from advice on ssh-audit.com [3] for a secure ssh configuration). On the E3-1271 v3 server it is 0.336s, on the QMU system it is 28.022s, and on the LicheePi it is 0.592s. By this metric the LicheePi is about 80% slower than decent x86 systems and the QEMU emulation of RISC-V is 73* slower than the x86 system it runs on. Does crypto depend on instructions that are difficult to emulate?
HostKey /etc/ssh/ssh_host_ed25519_key
KexAlgorithms -ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group14-sha256
MACs -umac-64-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
I haven t yet tested the performance of Ethernet (what routing speed can you get through the 2 gigabit ports?), emmc storage, and USB. At the moment I ve been focused on using RISC-V as a test and development platform. My conclusion is that I m glad I don t plan to compile many kernels or anything large like LibreOffice. But that for typical development that I do it will be quite adequate. The speed of Chromium seems adequate in basic tests, but the video output hasn t worked reliably enough to do advanced tests. Hardware Features Having two Gigabit Ethernet ports, 4 USB-3 ports, and Wifi on board gives some great options for using this as a router. It s disappointing that they didn t go with 2.5Gbit as everyone seems to be doing that nowadays but Gigabit is enough for most things. Having only a single HDMI port and not supporting USB-C docks (the USB-C port appears to be power only) limits what can be done for workstation use and for controlling displays. I know of people using small ARM computers attached to the back of large TVs for advertising purposes and that isn t going to be a great option for this. The CPU and RAM apparently uses a lot of power (which is relative the entire system draws up to 2A at 5V so the CPU would be something below 5W). To get this working a cooling fan has to be stuck to the CPU and RAM chips via a layer of thermal stuff that resembles a fine sheet of blu-tack in both color and stickyness. I am disappointed that there isn t any more solid form of construction, to mount this on a wall or ceiling some extra hardware would be needed to secure this. Also if they just had a really big copper heatsink I think that would be better. 80386 CPUs with similar TDP were able to run without a fan. I wonder how things would work with all USB ports in use. It s expected that a USB port can supply a minimum of 2.5W which means that all the ports could require 10W if they were active. Presumably something significantly less than 5W is available for the USB ports. Other Devices Sipheed has a range of other devices in the works. They currently sell the LicheeCluster4A which support 7 compute modules for a cluster in a box. This has some interesting potential for testing and demonstrating cluster software but you could probably buy an AMD64 system with more compute power for less money. The Lichee Console 4A is a tiny laptop which could be useful for people who like the 7 laptop form factor, unfortunately it only has a 1280*800 display if it had the same resolution display as a typical 7 phone I would have bought one. The next device that appeals to me is the soon to be released Lichee Pad 4A which is a 10.1 tablet with 1920*1200 display, Wifi6, Bluetooth 5.4, and 16G of RAM. It also has 1 USB-C connection, 2*USB-3 sockets, and support for an external card with 2*Gigabit ethernet. It s a tablet as a laptop without keyboard instead of the more common larger phone design model. They are also about to release the LicheePadMax4A which is similar to the other tablet but with a 14 2240*1400 display and which ships with a keyboard to make it essentially a laptop with detachable keyboard. Conclusion At this time I wouldn t recommend that this device be used as a workstation or laptop, although the people who want to do such things will probably do it anyway regardless of my recommendations. I think it will be very useful as a test system for RISC-V development. I have some friends who are interested in this sort of thing and I can give them VMs. It is a bit expensive. The Sipheed web site boasts about the LicheePi4 being faster than the RaspberryPi4, but it s not a lot faster and the RaspberryPi4 is much cheaper ($127 or $129 for one with 8G of RAM). The RaspberryPi4 has two HDMI ports but a limit of 8G of RAM while the LicheePi has up to 16G of RAM and two Gigabit Ethernet ports but only a single HDMI port. It seems that the RaspberryPi4 might win if you want a cheap low power desktop system. At this time I think the reason for this device is testing out RISC-V as an alternative to the AMD64 and ARM64 architectures. An open CPU architecture goes well with free software, but it isn t just people who are into FOSS who are testing such things. I know some corporations are trying out RISC-V as a way of getting other options for embedded systems that don t involve paying monopolists. The Lichee Console 4A is probably a usable tiny laptop if the resolution is sufficient for your needs. As an aside I predict that the tiny laptop or pocket computer segment will take off in the near future. There are some AMD64 systems the size of a phone but thicker that run Windows and go for reasonable prices on AliExpress. Hopefully in the near future this device will have better video drivers and be usable as a small and quiet workstation. I won t rule out the possibility of making this my main workstation in the not too distant future, all it needs is reliable 4K display and the ability to decode 4K video. It s performance for web browsing and as an ssh client seems adequate, and that s what matters for my workstation use. But for the moment it s just for server use.

17 January 2024

Colin Watson: Task management

Now that I m freelancing, I need to actually track my time, which is something I ve had the luxury of not having to do before. That meant something of a rethink of the way I ve been keeping track of my to-do list. Up to now that was a combination of things like the bug lists for the projects I m working on at the moment, whatever task tracking system Canonical was using at the moment (Jira when I left), and a giant flat text file in which I recorded logbook-style notes of what I d done each day plus a few extra notes at the bottom to remind myself of particularly urgent tasks. I could have started manually adding times to each logbook entry, but ugh, let s not. In general, I had the following goals (which were a bit reminiscent of my address book): I didn t do an elaborate evaluation of multiple options, because I m not trying to come up with the best possible solution for a client here. Also, there are a bazillion to-do list trackers out there and if I tried to evaluate them all I d never do anything else. I just wanted something that works well enough for me. Since it came up on Mastodon: a bunch of people swear by Org mode, which I know can do at least some of this sort of thing. However, I don t use Emacs and don t plan to use Emacs. nvim-orgmode does have some support for time tracking, but when I ve tried vim-based versions of Org mode in the past I ve found they haven t really fitted my brain very well. Taskwarrior and Timewarrior One of the other Freexian collaborators mentioned Taskwarrior and Timewarrior, so I had a look at those. The basic idea of Taskwarrior is that you have a task command that tracks each task as a blob of JSON and provides subcommands to let you add, modify, and remove tasks with a minimum of friction. task add adds a task, and you can add metadata like project:Personal (I always make sure every task has a project, for ease of filtering). Just running task shows you a task list sorted by Taskwarrior s idea of urgency, with an ID for each task, and there are various other reports with different filtering and verbosity. task <id> annotate lets you attach more information to a task. task <id> done marks it as done. So far so good, so a redacted version of my to-do list looks like this:
$ task ls
ID A Project     Tags                 Description
17   Freexian                         Add Incus support to autopkgtest [2]
 7   Columbiform                      Figure out Lloyds online banking [1]
 2   Debian                           Fix troffcvt for groff 1.23.0 [1]
11   Personal                         Replace living room curtain rail
Once I got comfortable with it, this was already a big improvement. I haven t bothered to learn all the filtering gadgets yet, but it was easy enough to see that I could do something like task all project:Personal and it d show me both pending and completed tasks in that project, and that all the data was stored in ~/.task - though I have to say that there are enough reporting bells and whistles that I haven t needed to poke around manually. In combination with the regular backups that I do anyway (you do too, right?), this gave me enough confidence to abandon my previous text-file logbook approach. Next was time tracking. Timewarrior integrates with Taskwarrior, albeit in an only semi-packaged way, and it was easy enough to set that up. Now I can do:
$ task 25 start
Starting task 00a9516f 'Write blog post about task tracking'.
Started 1 task.
Note: '"Write blog post about task tracking"' is a new tag.
Tracking Columbiform "Write blog post about task tracking"
  Started 2024-01-10T11:28:38
  Current                  38
  Total               0:00:00
You have more urgent tasks.
Project 'Columbiform' is 25% complete (3 of 4 tasks remaining).
When I stop work on something, I do task active to find the ID, then task <id> stop. Timewarrior does the tedious stopwatch business for me, and I can manually enter times if I forget to start/stop a task. Then the really useful bit: I can do something like timew summary :month <name-of-client> and it tells me how much to bill that client for this month. Perfect. I also started using VIT to simplify the day-to-day flow a little, which means I m normally just using one or two keystrokes rather than typing longer commands. That isn t really necessary from my point of view, but it does save some time. Android integration I left Android integration for a bit later since it wasn t essential. When I got round to it, I have to say that it felt a bit clumsy, but it did eventually work. The first step was to set up a taskserver. Most of the setup procedure was OK, but I wanted to use Let s Encrypt to minimize the amount of messing around with CAs I had to do. Getting this to work involved hitting things with sticks a bit, and there s still a local CA involved for client certificates. What I ended up with was a certbot setup with the webroot authenticator and a custom deploy hook as follows (with cert_name replaced by a DNS name in my house domain):
#! /bin/sh
set -eu
cert_name=taskd.example.org
found=false
for domain in $RENEWED_DOMAINS; do
    case "$domain" in
        $cert_name)
            found=:
            ;;
    esac
done
$found   exit 0
install -m 644 "/etc/letsencrypt/live/$cert_name/fullchain.pem" \
    /var/lib/taskd/pki/fullchain.pem
install -m 640 -g Debian-taskd "/etc/letsencrypt/live/$cert_name/privkey.pem" \
    /var/lib/taskd/pki/privkey.pem
systemctl restart taskd.service
I could then set this in /etc/taskd/config (server.crl.pem and ca.cert.pem were generated using the documented taskserver setup procedure):
server.key=/var/lib/taskd/pki/privkey.pem
server.cert=/var/lib/taskd/pki/fullchain.pem
server.crl=/var/lib/taskd/pki/server.crl.pem
ca.cert=/var/lib/taskd/pki/ca.cert.pem
Then I could set taskd.ca on my laptop to /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt and otherwise follow the client setup instructions, run task sync init to get things started, and then task sync every so often to sync changes between my laptop and the taskserver. I used TaskWarrior Mobile as the client. I have to say I wouldn t want to use that client as my primary task tracking interface: the setup procedure is clunky even beyond the necessity of copying a client certificate around, it expects you to give it a .taskrc rather than having a proper settings interface for that, and it only seems to let you add a task if you specify a due date for it. It also lacks Timewarrior integration, so I can only really use it when I don t care about time tracking, e.g. personal tasks. But that s really all I need, so it meets my minimum requirements. Next? Considering this is literally the first thing I tried, I have to say I m pretty happy with it. There are a bunch of optional extras I haven t tried yet, but in general it kind of has the vim nature for me: if I need something it s very likely to exist or easy enough to build, but the features I don t use don t get in my way. I wouldn t recommend any of this to somebody who didn t already spend most of their time in a terminal - but I do. I m glad people have gone to all the effort to build this so I didn t have to.

16 January 2024

Thomas Koch: Missing memegen

Posted on May 1, 2022
Back at $COMPANY we had an internal meme-site. I had some reputation in my team for creating good memes. When I watched Episode 3 of Season 2 from Yes Premier Minister yesterday, I really missed a place to post memes. This is the full scene. Please watch it or even the full episode before scrolling down to the GIFs. I had a good laugh for some time. With Debian, I could just download the episode from somewhere on the net with youtube-dl and easily create two GIFs using ffmpeg, with and without subtitle:
ffmpeg  -ss 0:5:59.600 -to 0:6:11.150 -i Downloads/Yes.Prime.Minister.S02E03-1254485068289.mp4 tmp/tragic.gif
ffmpeg  -ss 0:5:59.600 -to 0:6:11.150 -i Downloads/Yes.Prime.Minister.S02E03-1254485068289.mp4 \
        -vf "subtitles=tmp/sub.srt:force_style='Fontsize=60'" tmp/tragic_with_subtitle.gif
And this sub.srt file:
1
00:00:10,000 --> 00:00:12,000
Tragic.
I believe, one needs to install the libavfilter-extra variant to burn the subtitle in the GIF. Some space to hide the GIFs. The Premier Minister just learned, that his predecessor, who was about to publish embarassing memories, died of a sudden heart attack: I can t actually think of a meme with this GIF, that the internal thought police community moderation would not immediately take down. For a moment I thought that it would be fun to have a Meme-Site for Debian members. But it is probably not the right time for this. Maybe somebody likes the above GIFs though and wants to use them somewhere.

Next.

Previous.